Active and unsupervised learning for automatic speech recognition

Authors

  • Giuseppe Riccardi
  • Dilek Z. Hakkani-Tür
Abstract

State-of-the-art speech recognition systems are trained using human transcriptions of speech utterances. In this paper, we describe a method to combine active and unsupervised learning for automatic speech recognition (ASR). The goal is to minimize the human supervision for training acoustic and language models and to maximize the performance given the transcribed and untranscribed data. Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples, and then selecting the most informative ones with respect to a given cost function. For unsupervised learning, we utilize the remaining untranscribed data by using their ASR output and word confidence scores. Our experiments show that the amount of labeled data needed for a given word accuracy can be reduced by 75% by combining active and unsupervised learning.
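The selection step described in the abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Utterance` type, the `split_pool` helper, the use of the mean word confidence as the utterance-level score, and the `min_conf` cutoff are all assumptions made for the example. The least confident utterances are sent for manual transcription (active learning), and the rest keep their ASR output as machine-generated labels (unsupervised learning).

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class Utterance:
    """Hypothetical container for one untranscribed utterance."""
    utt_id: str
    words: list[str]          # 1-best ASR hypothesis
    confidences: list[float]  # one confidence score in [0, 1] per word


def utterance_score(utt: Utterance) -> float:
    """Assumed utterance-level confidence: the mean word confidence."""
    return mean(utt.confidences) if utt.confidences else 0.0


def split_pool(pool: list[Utterance], budget: int, min_conf: float = 0.5):
    """Split an untranscribed pool into two sets.

    to_transcribe: the `budget` least confident utterances, to be
        labeled by a human (active learning).
    machine_labeled: remaining utterances whose score is at least
        `min_conf`, kept with their ASR output (unsupervised learning).
        The cutoff is an assumption of this sketch.
    """
    ranked = sorted(pool, key=utterance_score)
    to_transcribe = ranked[:budget]
    machine_labeled = [u for u in ranked[budget:] if utterance_score(u) >= min_conf]
    return to_transcribe, machine_labeled


pool = [
    Utterance("a", ["hi"], [0.9]),
    Utterance("b", ["uh", "no"], [0.2, 0.4]),
    Utterance("c", ["yes"], [0.6]),
    Utterance("d", ["ok"], [0.45]),
]
manual, automatic = split_pool(pool, budget=1)
print([u.utt_id for u in manual], [u.utt_id for u in automatic])
```

In a full system both sets would be added to the training data and the models retrained; how word-level confidences weight the machine-labeled data during training is not specified in the abstract and is left out here.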

Similar resources

Active and Unsupervised Learning for A

State-of-the-art speech recognition systems are trained using human transcriptions of speech utterances. In this paper, we describe a method to combine active and unsupervised learning for automatic speech recognition (ASR). The goal is to minimize the human supervision for training acoustic and language models and to maximize the performance given the transcribed and untranscribed data. Active...

Unsupervised training and directed manual transcription for LVCSR

A significant cost in obtaining acoustic training data is the generation of accurate transcriptions. When no transcription is available, unsupervised training techniques must be used. Furthermore, the use of discriminative training has become a standard feature of state-of-the-art large vocabulary continuous speech recognition (LVCSR) systems. In unsupervised training, unlabelled data are recogni...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Optimizing Data Selection for Automatic Speech Recognition in Low Resource Languages

Developing Automatic Speech Recognition (ASR) systems for low resource languages is a labor-, computation-, and time-intensive task. Data selection techniques seek highly informative subsets of speech data for transcription and can lead to considerable reduction in time and expense for transcription and ASR training. This project investigates unsupervised and supervised data selection techniques...

Publication date: 2003